Optimal Matrix Multiplication on Fault-Tolerant VLSI Arrays
Authors
Abstract
where $h$ is the decision tree height, or the number of radial cuts, whose $i$ cross edges can be chosen independently: $h = (f^{L-k} - 1)/(f - 1)$. Therefore, the number of permutations having the number of cross edges $i = 1$ in a stage-$k$ subgraph of a $(3, L)$ CC-banyan is $C_{1,k} = 3^{(3^{L-k} - 1)/2}$, $0 < k < L-2$. Now we can evaluate the total number of permutations in a CC-banyan with fan-out 3.
Theorem 6: The number of permutations in an $(f, L)$ CC-banyan with fan-out $f = 3$ is …
Proof: The expression for the total number of permutations …
The method of decision trees can also be used to enumerate all permutations in the case of general fan-out $f$. We have been able to evaluate the number of permutations in CC-banyans with fan-out $f = 4$. However, the process of constructing decision trees becomes quite complex and elaborate for large values of $f$, because the analysis of all possible permutations in a radial sector of size $(f-1)$ becomes a complex combinatorial problem in itself when $f$ is large. Even so, for practical purposes, the number of permutations for larger fan-outs can be easily obtained by combining analytical expressions (2) and (3) with computer-generated exhaustive enumeration for $C_{i,k}$.
V. CONCLUSIONS
In this correspondence, we presented a graph-theoretic approach to the analysis of rectangular CC-banyan networks with an arbitrary fan-out $f$ and an arbitrary number of stages $L$. It is shown that CC-banyans, like many other networks (e.g., SW-banyans, Benes networks), can be constructed recursively from networks of smaller size. This recursiveness enables the modular structure of CC-banyans and can be used in the analysis of their partitioning properties. We also presented a general approach to the analysis of the permuting properties of CC-banyans. Analytical expressions for the number of permutations performable by CC-banyans with fan-outs 2 and 3 are derived. In the case of general fan-out $f$, we proposed a method, based on decision tree analysis, for a systematic enumeration of all permutations.
Abstract: The Diogenes methodology, proposed by Rosenberg, for the design of easily testable and configurable fault-tolerant VLSI arrays, results in collinear layouts of processors (PE's) that are configured into the desired array structure by appropriate switch settings on buses running parallel to the PE's. While possessing attractive mechanisms for fault-tolerant implementations, Diogenes designs of two-dimensional (2-D) arrays require more area than a …
Similar resources
Design of algorithm-based fault-tolerant VLSI array processor - Computers and Digital Techniques [see also IEE Proceedings-Computers and Digital Techniques], IEE
In the paper, a systematic design methodology that maps a matrix arithmetic algorithm to a fault-tolerant array processor with different topologies and dimensions is presented. The design issues to be addressed in the method are: (a) how to derive a VLSI array with different topologies and dimensions from the algorithm; (b) how to distribute the data processing to the PEs so that a faulty PE wi...
Asymptotically Tight Bounds for Computing with Faulty Arrays of Processors (Extended Abstract)
In the paper, we analyze the computational power of 2- and 3-dimensional processor arrays that contain a potentially large number of faults. We consider both a random and a worst-case fault model, and we prove that in either scenario, low-dimensional arrays are surprisingly fault-tolerant. For example, we show how to emulate an n × n fault-free array on an n × n array containing Θ(n²) random...
The Robust-Algorithm Approach to Fault Tolerance on Processor Arrays: Fault Models, Fault Diameter, and Basic Algorithms
With few exceptions, the two issues of algorithm design and fault tolerance for processor arrays have been dealt with separately, in that algorithm developers have assumed the availability of complete fault-free arrays and fault tolerance techniques have aimed at restoring such complete arrays by reconfiguring faulty ones. We present the design of robust algorithms that run efficiently on compl...
2D matrix multiplication on a 3D systolic array
The introduction of systolic arrays in the late 1970s had an enormous impact on the area of special-purpose computing. However, most of the work so far has been done with one-dimensional and two-dimensional (2D) systolic arrays. Recent advances in three-dimensional VLSI (3D VLSI) and 3D packaging of 2D VLSI components have made the idea of 3D systolic arrays feasible in the near future. In this ...
Fault-tolerant parallel matrix multiplication with one iteration fault detection latency
The checksum technique is a low-cost method to detect errors in matrix operations performed by processor arrays. Because fault detection in this method is done only at problem termination, it is not an effective fault tolerance technique for large-scale matrix multiplication. This paper presents a new algorithm, the ID algorithm, which minimizes the fault-detection latency. In the ID al...
Journal title:
- IEEE Trans. Computers
Volume 38, Issue
Pages -
Publication date 1989